ABSTRACT

The Atlas of Genetics and Cytogenetics in Oncology 
and Haematology (http://AtlasGeneticsOncology. 
org) is a peer-reviewed internet journal/encyclo- 
paedia/database focused on genes implicated in 
cancer, cytogenetics and clinical entities in cancer 
and cancer-prone hereditary diseases. The main 
goal of the Atlas is to provide review articles that 
describe complementary topics, namely, genes, 
genetic abnormalities, histopathology, clinical diag- 
noses and a large iconography. This description, 
which was historically based on karyotypic 
abnormalities and in situ hybridization (fluorescence 
in situ hybridization) techniques, now benefits from 
comparative genomic hybridization and massive 
sequencing, uncovering a tremendous amount of 
genetic rearrangements. As the Atlas combines 
different types of information (genes, genetic abnor- 
malities, histopathology, clinical diagnoses and 
external links), its content is currently unique. The 
Atlas is a cognitive tool for fundamental and clinical 
research and has developed into an encyclopaedic 
work. In clinical practice, it contributes to the 
cytogenetic diagnosis and may guide treatment 
decision making, particularly regarding rare
diseases (which are individually rare but, being numerous,
are collectively encountered often). Readers as well as the
authors of the Atlas are researchers and/or clinicians. 



INTRODUCTION 

Why the Atlas? 

Cancer, the second most common cause of death in
developed countries (~27%), is the leading cause of premature
death (>35% of deaths before the age of 65 years). Cancer
is a genetic disorder. The prognosis of a leukaemia 
depends on the genes involved; median survival is 3 
months in case of an inv(3)(q21q26) (RPN1/EVI1), 
whereas 95% of patients with a dic(9;12)(p13;p13)
(PAX5/ETV6) are cured. Treatments depend on the 
severity of the disease. However, there are >900 leukaemia 
entities! In addition to the small set of a hundred or so
genes known to play a major role in cancer, 2000-9000
other genes are possibly implicated in cancer, 1200 types 
of solid tumours exist and 'n' hereditary diseases are 
cancer-prone conditions. Also, 25 000 new publications 
on cancer genetics in man are added annually to PubMed. 

The Atlas (1-5), which began in 1997, is an internet 
journal/encyclopaedia/database, and, as such, it is a 
hybrid formula. Like any scientific journal, it is peer 
reviewed and, like any internet site, it is easy to use (hyper- 
links, updates, automation in data mining). The Atlas 
focuses on genes involved in cancer, cytogenetics and 
clinical entities in cancer and cancer-prone diseases. This 
is the place where complementary topics, namely, genes, 
genetic abnormalities, histopathology and clinical diag- 
noses are developed and hyperlinked. We also have estab- 
lished links to the major external databases in the related 
field(s). It is a collective undertaking by researchers and 
clinicians aimed at describing the state of the art in cancer 
genetics for the medical and scientific community and
providing a cognitive tool for fundamental and clinical 
research. 

Readers of the Atlas are hospital consultants, researchers
and university teachers; students of medicine and of the
life sciences also use the Atlas.



DATABASE CONTENTS 

About 2200 authors have so far contributed to the Atlas, 
making 2167 review articles available. The Atlas contains 
peer-reviewed articles on 1135 genes, 503 leukaemia 
entities, 177 solid tumours and 104 cancer-prone inherited 
diseases. It also contains 'automated cards' on 8190 other 
genes potentially implicated in cancer. Automated cards 
are produced in the following manner: a list of genes is 
obtained from the gene directory of the NCBI (6), filtered 
by keywords related to cancer in Description, GO Terms 
and GeneRIF fields. Location data are selected from the 
refGene.txt file on the UCSC Genome server (7). 
Automated cards contain data released into the public 
domain and hyperlinks towards the main databases. The 
automated cards section will be extended to encompass 
the entire genome, that is, ~30000 genes. The Atlas also 
contains traditional articles, called 'Deep Insights' (81 art- 
icles to date), dealing with topics in areas related to core 
subjects in the Atlas [e.g. chromothripsis, centrosome, 
autophagy and so forth (8-10)], 63 case reports on haem- 
atological malignancies and 124 chapters in the educa- 
tional section (with ~40 in English, Spanish and 
French). Whenever possible, articles on leukaemias or 
solid tumours contain data on the prognosis, to help clin- 
icians, and an iconography of the chromosomes and histo- 
pathology specimens for the biologist's diagnosis. The 
Atlas contains an iconography of ~11 800 images. 
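
The keyword filter used to select genes for automated cards can be sketched as follows. This is a minimal illustration only: the keyword list, the record layout and the gene symbols are hypothetical, as the text does not specify them, and the real pipeline reads the NCBI gene directory rather than in-memory records.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical keyword list; the keywords actually used by the Atlas are not given.
CANCER_KEYWORDS = ("cancer", "tumor", "tumour", "leukemia", "leukaemia",
                   "oncogene", "carcinoma")


@dataclass
class GeneRecord:
    """One gene with the three text fields that the filter inspects."""
    symbol: str
    description: str = ""
    go_terms: List[str] = field(default_factory=list)
    generifs: List[str] = field(default_factory=list)


def is_cancer_related(gene: GeneRecord, keywords=CANCER_KEYWORDS) -> bool:
    """True if any keyword occurs in the Description, GO Terms or GeneRIF text."""
    haystack = " ".join([gene.description, *gene.go_terms, *gene.generifs]).lower()
    return any(keyword in haystack for keyword in keywords)


def select_candidates(genes: List[GeneRecord]) -> List[str]:
    """Symbols of the genes that would receive an automated card."""
    return [gene.symbol for gene in genes if is_cancer_related(gene)]


# Illustrative records with placeholder symbols, not real annotations.
genes = [
    GeneRecord("GENE_A", description="putative oncogene"),
    GeneRecord("GENE_B", description="olfactory receptor"),
    GeneRecord("GENE_C", generifs=["Overexpressed in breast carcinoma"]),
]
candidates = select_candidates(genes)
```

A plain substring match such as this one is deliberately permissive, which suits a list of genes 'potentially implicated' in cancer; a stricter filter would match whole words only.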

Articles can be accessed either by theme (genes, leukae- 
mias, solid tumours, cancer-prone disease) or by chromo- 
some. In the latter case, genes may be displayed in 
alphabetical order or in physical order from pter to qter, 
according to human genome assembly hg19 [February
2009, UCSC Genome browser (7)]. The 'Case Reports in 
haematology' section was created to allow the delineation 
of new leukaemia entities, describing and revealing the 
natural history/epidemiology of a rare disease with no 
previous clinical description and an unknown prognosis. 
This contributes to applied research in epidemiology, and 
it will allow better therapeutic decisions. A portal with 
numerous links devoted to genetics and/or cancer adds 
to the information available in the Atlas. 
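
The two display orders offered for a chromosome can be illustrated with a short sketch. It assumes that each gene carries a start coordinate taken from the genome assembly, so that ascending coordinates run from pter to qter; the coordinates below are rounded placeholders chosen only to respect the known band order, not values read from refGene.txt.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    symbol: str
    chrom: str
    start: int  # leftmost coordinate on the chromosome, in base pairs


def alphabetical(genes: List[Gene], chrom: str) -> List[str]:
    """Gene symbols on one chromosome, in alphabetical order."""
    return sorted(g.symbol for g in genes if g.chrom == chrom)


def pter_to_qter(genes: List[Gene], chrom: str) -> List[str]:
    """Gene symbols on one chromosome, ordered by ascending coordinate."""
    on_chrom = [g for g in genes if g.chrom == chrom]
    return [g.symbol for g in sorted(on_chrom, key=lambda g: g.start)]


# Rounded placeholder coordinates, for illustration only.
genes = [
    Gene("ABL1", "chr9", 133_000_000),   # 9q34
    Gene("PAX5", "chr9", 37_000_000),    # 9p13
    Gene("CDKN2A", "chr9", 22_000_000),  # 9p21
    Gene("ETV6", "chr12", 12_000_000),   # 12p13
]
by_name = alphabetical(genes, "chr9")
by_position = pter_to_qter(genes, "chr9")
```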



RECENT DEVELOPMENTS 

Cell biology corner 

The various articles related to a given topic in the field of 
cell biology (e.g. apoptosis, cell cycle, micro RNA, nuclear 
membrane), as well as topics related to physiopathology 
(e.g. epithelial-mesenchymal transition or angiogenesis) 
are collected and grouped together in this section, along 
with a selected iconography (Figure 1). 



Atlas database 

In 1997, the Atlas templates consisted of concise 
structured files. The model had to evolve with novel de- 
velopments (new areas being explored, such as the cell 
biology corner, and the fact that full review articles grad- 
ually replaced the concise files). The templates of the 
articles in the database had to be modifiable. This was 
done by describing headings, sub-headings and the type 
of information they contain, with no limits on the hier- 
archy of the sub-headings, so that the database structure 
could evolve constantly. The concepts of the Meta-Object
Facility (MOF) (11)
http://dl.acm.org/citation.cfm?id=1028976.1029004&coll=DL&dl=GUIDE
were used to implement such a process. MOF makes it
possible to describe not only the data but also the way it 
is structured in the same environment. MOF offers the 
possibility of changing dynamically allocated data struc- 
tures and of applying these evolutions to the data without 
re-programming applications. It consists of a fixed 
meta-model that describes the data representation model 
and the resulting model that stores the data. Although 
tools exist to create such a database, they are still experi- 
mental and cumbersome to implement for users. A hybrid 
model has been created, using MOF exclusively for the 
concepts concerning the structuring of the articles. The 
other data structures used to manage the application are 
implemented in a relational database (Figure 2). In 
addition, an automatic survey of different databases 
permits an updated list of consistent identifiers that are 
used to generate the external links for all the genes 
defined in the Atlas. 
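
The idea of storing the article structure as data can be sketched with a small in-memory SQLite database. This is an illustration under stated assumptions, not the Atlas schema: the table names TBloc and DataType follow Figure 2, but every column name, the helper functions and the example headings are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DataType (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
-- One row per heading; parent_id allows sub-headings of any depth.
CREATE TABLE TBloc (
    id INTEGER PRIMARY KEY,
    category TEXT NOT NULL,
    title TEXT NOT NULL,
    parent_id INTEGER REFERENCES TBloc(id),
    datatype_id INTEGER REFERENCES DataType(id)
);
""")
conn.executemany("INSERT INTO DataType (name) VALUES (?)",
                 [("text",), ("image",), ("link",)])


def add_heading(conn, category, title, parent_id=None, datatype="text"):
    """Register a heading (or sub-heading) for one category of article."""
    dt_id = conn.execute("SELECT id FROM DataType WHERE name = ?",
                         (datatype,)).fetchone()[0]
    cur = conn.execute(
        "INSERT INTO TBloc (category, title, parent_id, datatype_id) "
        "VALUES (?, ?, ?, ?)",
        (category, title, parent_id, dt_id))
    return cur.lastrowid


def outline(conn, category, parent_id=None, depth=0):
    """Return the template of a category as a list of (depth, title) pairs."""
    rows = conn.execute(
        "SELECT id, title FROM TBloc "
        "WHERE category = ? AND parent_id IS ? ORDER BY id",
        (category, parent_id)).fetchall()
    result = []
    for row_id, title in rows:
        result.append((depth, title))
        result.extend(outline(conn, category, row_id, depth + 1))
    return result


# Hypothetical headings for a 'gene' article template.
identity = add_heading(conn, "gene", "Identity")
add_heading(conn, "gene", "Location", parent_id=identity)
dna = add_heading(conn, "gene", "DNA/RNA")
add_heading(conn, "gene", "Pictures", parent_id=dna, datatype="image")
template = outline(conn, "gene")
```

Because the template is read from the tables each time, adding a heading is an insert into TBloc rather than a change to the application code, which is the property the text attributes to the MOF approach.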



Open access electronic journal 

An 'open access journal' version of the Atlas is now avail- 
able on the digital publishing platform I-Revues of the 
Institute for Scientific and Technical Information 
(INIST) of the French National Centre for Scientific 
Research (CNRS). It presents the archives of a quarterly 
journal since 1997, which became a bimonthly journal in 
2008 and a monthly journal in 2009, comprising 2215 
articles in 90 volumes, which constitutes a 7465-page collection,
available at: http://documents.irevues.inist.fr/handle/2042/15655.
DSpace software, an open source repository software package, is
used. It is based on banks of
digital objects described by a set of standardized metadata 
in extensible mark-up language format (Dublin Core 
qualified) and a standard uniform identifier (CNRI 
handle). Metadata are also available on the Web in 
Dublin Core extensible mark-up language format and 
are freely harvestable. DSpace supports the common 
interoperability standards used in the institutional reposi- 
tory domain, such as the Open Archives Initiative 
Protocol for Metadata Harvesting (OAI-PMH protocol). 
A digital object identifier (DOI®), registered with the
international agency CrossRef, is assigned to each article;
it ensures persistent referencing of the articles and enables
publishers to create direct links between scientific articles
and also with databases.
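
Harvesting Dublin Core metadata over OAI-PMH can be sketched as follows. The `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH and Dublin Core specifications; the base URL and the sample record are hypothetical placeholders, as the text does not give the repository's OAI endpoint, and no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, set_spec=None):
    """Build a ListRecords request asking for unqualified Dublin Core."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract the identifier and titles of each record in a response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier",
                                  default="", namespaces=NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    return records


# Hypothetical endpoint and a hand-written sample response.
url = list_records_url("https://repository.example.org/oai/request")
sample = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:0000/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example article title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()
records = parse_records(sample)
```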

Figure 1. An example of a cell biology cluster, where articles and iconography concerning a given theme are assembled, illustrated here by
the nuclear membrane topic (e.g. the Deep Insight 'The nuclear pore complex becomes alive: new insights into its dynamics and
involvement in different cellular processes').

Figure 2. A simplified representation of the Atlas database. Structures are described in tables TBloc, TEntity and DataType. The contents of the
articles are stored in tables Block, Entity, Images and so forth. DataType contains a list of data types that are used to create an article, such as text,
images, links and so forth. The data are stored in a specific table for each type. TEntity describes the entities that make up a block. TBloc describes
how articles are structured into topics and sub-topics. The articles are classified into different categories (leukaemia, gene and so forth), each class having its
own section structure. To create a new category of articles, the publishing tool will search for any structure that must be implemented by querying
the database metadata (TBloc, TEntity, DataType).

FUTURE DIRECTIONS

Description of recurrent anomalies, gene fusions and other 
cytogenomic-acquired alterations 

The main goal of the Atlas is to describe the recurrent 
anomalies in the human genome implicated in oncological 
processes, the genes involved and the various diseases in 
which they are implicated. This description, which was 
historically based on karyotypic abnormalities and in 
situ hybridization (fluorescence in situ hybridization) 
techniques, now benefits from comparative genomic 
hybridization and massive sequencing. The field of 
cytogenomic oncology has exploded with copy number, 
single-nucleotide polymorphism and loss of heterozygosity 
techniques and next-generation genome sequencing, un- 
covering a tremendous amount of acquired genomic alter- 
ations (polymorphisms, mutations, copy number 
alterations, rearrangements, duplications, inversions, 
translocations, with gene fusions or re-locations), confirm- 
ing/revealing the extreme complexity of cellular alterations 
at the origin of cancer and during cancer progression. The 
result is a massive amount of new data. All of these data 
are currently scattered across various public databases or 
may be found exclusively in publications. 

The Atlas plans to draw on data from the Mitelman
database (12) http://cgap.nci.nih.gov/Mitelman/, the
Cancer Genome Project (13) and COSMIC (as a reference
for mutations) (14), to integrate all these data sets and
make them available by chromosomal band. The Atlas also
intends to include the location of diagnostic probes.

Orphan chromosome anomalies 

A chromosomal abnormality in a leukaemia patient is
said to be a 'driver' anomaly (driving carcinogenesis)
either when a known oncogene is implicated in the
rearrangement or when at least two cases have been
described with the same breakpoints. Consequently, every
laboratory holds 'single' orphan cases, presumed to be
'passenger' anomalies, that are left on the back burner. An
'orphan chromosomal abnormalities' module will 
provide an area where cytogeneticists will be able to 
deposit the description of yet unknown anomalies, with 
complete clinical, cytological and cytogenetic data. As 
soon as a second case is deposited alongside a first case 
in the repository, the anomaly will become ipso facto a 
'driver' anomaly, warranting further studies and scientific 
communications. 
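
The promotion rule described above can be sketched as a small repository class. This is a hypothetical illustration of the rule as stated in the text (a known oncogene, or at least two cases with the same breakpoints); the class, its method names, the whitespace-only normalisation and the example anomaly are assumptions, not a description of the planned module.

```python
from collections import defaultdict


def normalise(breakpoints: str) -> str:
    """Remove whitespace so that equivalent descriptions compare equal."""
    return "".join(breakpoints.split())


class OrphanRepository:
    """Stores deposited cases, keyed by their normalised breakpoints."""

    def __init__(self):
        self._cases = defaultdict(list)
        self._oncogene_keys = set()

    def status(self, breakpoints: str) -> str:
        """'driver' if a known oncogene is involved or two or more cases exist."""
        key = normalise(breakpoints)
        if key in self._oncogene_keys or len(self._cases[key]) >= 2:
            return "driver"
        return "passenger"

    def deposit(self, breakpoints: str, case_id: str,
                known_oncogene: bool = False) -> str:
        """Record one case and return the resulting status of the anomaly."""
        key = normalise(breakpoints)
        self._cases[key].append(case_id)
        if known_oncogene:
            self._oncogene_keys.add(key)
        return self.status(key)


# Hypothetical anomaly deposited by two laboratories.
repo = OrphanRepository()
first = repo.deposit("t(1;2)(p36;q21)", "lab-A-case-1")
second = repo.deposit("t(1;2) (p36;q21)", "lab-B-case-7")
```

A real module would need a stricter comparison of breakpoints than whitespace removal, for example parsing the karyotype nomenclature, but the counting logic would stay the same.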

Scientific and professional societies 

Finally, the Atlas has so far relied on institutional funds. 
It recently launched a public appeal for donations and
grants because of cash flow problems, and some individ- 
uals as well as scientific societies responded positively. 
A further step we would welcome is that scientific and 
professional societies in genetics, cancer genetics and on- 
cology areas become involved in the scientific (and finan- 
cial) long-term development of the Atlas. 



ELECTRONIC ADDRESSES 

http://AtlasGeneticsOncology.org Atlas of Genetics 
and Cytogenetics in Oncology and Haematology Home 
Page. 

http://chromosomesincancer.org The non-profit associ- 
ation ARMGHM's Home Page, whose goal is to host and 
handle the Atlas and receive donations. 

http://documents.irevues.inist.fr/handle/2042/15655 
Archives of the Atlas Journal. 



CITING THE ATLAS 

If you use the Atlas of Genetics and Cytogenetics in 
Oncology and Haematology in your published research, 
please cite this article.